Multi-level access control for distributed storage systems

ABSTRACT

System and method for accessing a distributed storage system uses a storage-level access control process at a distributed file system that interfaces with the distributed storage system to determine whether a particular client has access to a particular first file system object using an identifier of the particular client and storage-level access control rules in response to a file system request from the particular client to access a second file system object in the particular first file system. The storage-level access control rules are defined for a plurality of clients and a plurality of first file system objects of the distributed storage system to allow the particular client access to the second file system object in the particular first file system only if the particular client has been determined to have access to the particular first file system object according to the storage-level access control rules.

BACKGROUND

Currently, there is an unprecedented need for scalable high performance storage and data management, partly due to the wide use of cloud computing. Large distributed storage systems have been developed to satisfy this need for scalable high performance storage and data management. Some of these large distributed storage systems may support a number of tenants, each of which may include one or more clients. Such large distributed storage systems provide isolated storage services to the different tenants. In order to provide these isolated storage services, the distributed storage systems utilize authentication schemes so that each tenant can access the storage services dedicated to that tenant. Some of the distributed storage systems require each tenant to use a provider specific authentication scheme.

Although isolated storage services are desired for most situations involving tenants, there are situations where it may be desirable to share at least some of the storage services dedicated to a particular tenant with clients external to that particular tenant. In addition, having the tenants convert to a provider specific authentication scheme is often inconvenient and sometimes impossible.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a distributed computer system in accordance with an embodiment of the invention.

FIG. 2 is a block diagram of a host computer that can support clients of the distributed computer system in accordance with an embodiment of the invention.

FIG. 3 is a diagram illustrating different clients of the distributed computer system trying to access file system objects in a file system volume in accordance with an embodiment of the invention.

FIG. 4 is a block diagram of the distributed computer system of FIG. 1 in accordance with one implementation.

FIG. 5 is a process flow diagram of a storage access operation of the distributed computer system 100 in accordance with an embodiment of the invention.

FIG. 6 is a flow diagram of a method for accessing a distributed storage system in accordance with an embodiment of the invention.

Throughout the description, similar reference numbers may be used to identify similar elements.

DETAILED DESCRIPTION

FIG. 1 depicts a distributed computer system 100 in accordance with an embodiment of the invention. The distributed computer system includes multiple tenants 102, each with one or more clients 104, a distributed file system 106 and a distributed storage system 108. As used herein, a “client” can be any software entity that can run on a computer system, such as a software application, a software process, a virtual machine (VM) or a “virtual container” that provides system-level process isolation. The distributed storage system can be accessed by the clients of the various tenants via the distributed file system to perform data operations on the distributed storage system. As described in more detail below, the distributed computer system provides a multi-level access control for the distributed storage system so that specific clients have access to specific file objects stored in the distributed storage system for specific file system operations regardless of the tenants to which the specific clients belong.

As shown in FIG. 1, the distributed storage system 108 includes computer data storage devices 110, input/output (IO) servers 112 and metadata servers 114. The distributed storage system is scalable, and thus, the number of data storage devices, IO servers and metadata servers included in the storage system can be changed as needed to increase or decrease the capacity of the storage system to support increase/decrease in workload. Consequently, the exact number of data storage devices, IO servers and metadata servers included in the storage system can vary from tens to hundreds or more.

The data storage devices 110 of the distributed storage system 108 can be any type of non-volatile storage devices that are commonly used for data storage. As an example, the data storage devices may be, but are not limited to, solid-state devices (SSDs), hard disks or a combination of the two. The storage space provided by the data storage devices is divided into storage blocks 116, which may be disk blocks, disk sectors or other storage device sectors.

The IO servers 112 of the distributed storage system 108 operate to facilitate data operations with respect to the data storage devices. The IO servers may manage low-level data storage tasks, such as request scheduling and data layout. In some embodiments, the IO servers may organize data and present a simple object-based data access interface to the rest of the distributed computer system 100.

The metadata servers 114 of the distributed storage system 108 operate to facilitate metadata operations associated with the storage blocks 116 of the data storage devices 110, including metadata that indicates which storage blocks of the data storage devices have been allocated and which storage blocks of the data storage devices are free or available for allocation. This type of metadata is sometimes referred to herein as storage block allocation metadata. In some embodiments, the metadata servers are the same IO servers described above. The ability to separate metadata and IO data paths opens doors to further performance optimizations, as the access patterns of metadata and data are usually distinct.

The distributed file system 106 operates to present storage resources of the distributed storage system 108 as file systems, which include hierarchies of file system objects, such as file system volumes, file directories, folders and files, to the different clients 104 for shared access. Thus, the distributed file system organizes the storage resources of the distributed storage system into the file systems so that the clients can access the file system objects for various file system operations, such as creating file objects, deleting file objects, writing or storing file objects and reading or retrieving file objects.

The distributed file system 106 includes a storage-level access control mechanism 118. The storage-level access control mechanism provides a part of the multi-level access control of the distributed computer system 100. The storage-level access control mechanism operates to control access with respect to the clients and the file system objects, such as file system volumes, which is referred to herein as a storage-level access control process. The storage-level access control mechanism may limit access to one or more file system operations, such as read operations, write operations and create file object operations. In other words, the storage-level access control mechanism may define file system operations that can be performed for a client that has access to a file system object, i.e., an access relationship between one client and one file system object. This parameter is referred to herein as an access capability. The storage-level access control mechanism uses storage-level access control rules that specify which clients can access which file system objects with which access capabilities.

In an embodiment, the storage-level access control mechanism 118 uses client sets, file system object sets and the access capabilities to derive the storage-level access control rules. In this embodiment, the file system object sets will be described as sets of file system volumes, or volume sets. A volume set is defined as an arbitrary set of file system volumes. One file system volume can belong to any number of volume sets. Similarly, a client set includes an arbitrary set of clients and one client can belong to any number of client sets. A client set can contain other client sets. Between every client set and volume set, an access capability is defined. In this embodiment, a client can see the file system volume mounted inside the mounting point (e.g., /mnt/dfs) only if it has access to the volume. The information regarding the client sets, the file system object sets and the access capabilities may be maintained in a storage-level access control database 120, which may be stored in the root file system volume of the distributed file system 106. Since the clients 104 are mounted to the distributed storage system 108 via the distributed file system 106 and the file system objects are known to the distributed file system, the storage-level access control mechanism can determine the clients associated with storage access requests using client identifications, such as VM identification (VMID), and determine the target file system objects, such as file system volumes, associated with the requests to provide effective access control.
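As an illustration only, the client sets, volume sets and access capabilities described above may be modeled as in the following Python sketch. All class and function names, the client and volume identifiers, and the choice to take the union of capabilities when a client matches several rules are assumptions made for this example; the description does not specify how the storage-level access control database 120 is laid out or how overlapping rules are combined.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Capability(Flag):
    """File system operations a client may perform on a volume (hypothetical encoding)."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    CREATE = auto()
    FULL = READ | WRITE | CREATE


@dataclass
class ClientSet:
    name: str
    clients: set = field(default_factory=set)    # client identifiers, e.g. VMIDs
    subsets: list = field(default_factory=list)  # nested ClientSet objects

    def contains(self, client_id, _seen=None):
        """Return True if client_id is in this set or in any nested client set."""
        seen = _seen if _seen is not None else set()
        if id(self) in seen:  # guard against accidental cyclic nesting
            return False
        seen.add(id(self))
        if client_id in self.clients:
            return True
        return any(sub.contains(client_id, seen) for sub in self.subsets)


@dataclass
class VolumeSet:
    name: str
    volumes: set = field(default_factory=set)


class AccessControlDatabase:
    """In-memory stand-in for a storage-level access control database."""

    def __init__(self):
        self._rules = []  # (ClientSet, VolumeSet, Capability) triples

    def grant(self, client_set, volume_set, capability):
        self._rules.append((client_set, volume_set, capability))

    def capability(self, client_id, volume):
        """Union of capabilities over every rule matching client and volume (assumed policy)."""
        result = Capability.NONE
        for client_set, volume_set, cap in self._rules:
            if volume in volume_set.volumes and client_set.contains(client_id):
                result |= cap
        return result


# Hypothetical example data.
tenant_a = ClientSet("tenant-a", clients={"vm-1", "vm-3"})
research = ClientSet("research", clients={"vm-4"})
readers = ClientSet("readers", subsets=[tenant_a, research])  # a set of client sets
shared = VolumeSet("shared", volumes={"volume-b"})

db = AccessControlDatabase()
db.grant(tenant_a, shared, Capability.FULL)
db.grant(readers, shared, Capability.READ)
```

With this example data, a lookup for "vm-1" on "volume-b" yields the full capability, a lookup for "vm-4" yields read-only access through the nested set, and a client that appears in no client set receives no capability and so would not see the volume.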

As noted above, each of the tenants 102 includes one or more clients 104, which can access the distributed storage system 108 via the distributed file system 106. The tenants may belong to different organizations or companies and the clients of the tenants may execute applications or other software programs for the organizations or companies. The clients of each tenant may be running in one or more host computers, which may be located at different locations. An example of such a host computer is shown in FIG. 2.

FIG. 2 shows a host computer 200 that can support a number of clients 220A, 220B . . . 220L (where L is a positive integer). In this embodiment, the clients are VMs. However, in other embodiments, the clients supported by the host computer may be “virtual containers” or other processing entities. The number of VMs supported by the host computer can be anywhere from one to more than one hundred. The exact number of VMs supported by the host computer is only limited by the physical resources of the host computer. The VMs share at least some of the hardware resources of the host computer, which include one or more system memories 222, one or more processors 224, a storage interface 226, and a network interface 228. In FIG. 2, the physical connections between the various components of the host computer are not illustrated. Each system memory 222, which may be random access memory (RAM), is the volatile memory of the host computer. Each processor 224 can be any type of a processor, such as a central processing unit (CPU) commonly found in a server. In some embodiments, each processor may be a multi-core processor, and thus, includes multiple independent processing units or cores. The storage interface 226 is an interface that allows the host computer to communicate with storage. As an example, the storage interface may be a host bus adapter or a network file system interface. The network interface 228 is an interface that allows the host computer to communicate with other devices connected to the same network. As an example, the network interface may be a network adapter.

In the illustrated embodiment, the VMs 220A, 220B . . . 220L run on “top” of a hypervisor 230, which is a software interface layer that, using virtualization technology, enables sharing of the hardware resources of the host computer 200 by the VMs. However, in other embodiments, one or more of the VMs can be nested, i.e., a VM running in another VM. Any computer virtualization architecture can be implemented. For example, the hypervisor may run on top of the host computer's operating system or directly on hardware of the host computer. With the support of the hypervisor, the VMs provide isolated execution spaces for guest software. Each VM may include a guest operating system 232 and one or more guest applications 234. The guest operating system manages virtual system resources made available to the corresponding VM by the hypervisor, and, among other things, the guest operating system forms a software platform on top of which the guest applications run.

Turning back to FIG. 1, each tenant 102 includes a client-level access control mechanism 122. The client-level access control mechanism provides another part of the multi-level access control of the distributed computer system 100. The client-level access control mechanism operates to provide fine-grained access control at the clients of tenants, which is referred to herein as a client-level access control process. In particular, the client-level access control mechanism provides a finer-grained access control than the storage-level access control mechanism. As an example, the client-level access control mechanism may provide access to a file or file directory, while the storage-level access control mechanism may provide access to the file system volume to which that file or file directory belongs. Thus, the storage-level access control mechanism provides access control to one or more file system objects and the client-level access control mechanism provides access control to a file system object in one of these file system objects.

The distributed computer system 100 allows each tenant or a set of clients within a tenant to choose the type of client-level access control mechanism or no client-level access control mechanism to be used for the respective clients. Thus, the client-level access control mechanisms 122 utilized in the tenants and/or sets of clients can be different. In some embodiments, the client-level access control mechanisms use access control list schemes. As an example, the client-level access control mechanisms may use local passwords, Lightweight Directory Access Protocol (LDAP), OpenLDAP, Active Directory or other known authentication means to provide client-level access control.

One example of a client-level access control process involves an in-kernel file system, such as Linux v9fs, which is a Plan 9 File Protocol (9P) client. The authentication is enforced by the client kernel (login via local password file or LDAP). The authorization is enforced by a client file system such as v9fs by interpreting the per-object metadata stored in the underlying storage system, i.e., the distributed storage system 108. In particular, the 9P client of the distributed file system stores Portable Operating System Interface (POSIX) Access Control Lists (ACLs) as extended attributes in the distributed file system 106.
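To make the idea of interpreting per-object ACL metadata concrete, the following Python sketch evaluates an ACL written in the familiar short text form (for example "user::rw-,group::r--,other::---"). It is a simplified illustration that follows the usual POSIX ACL evaluation order (owner, named user, groups, other, with the mask applied to named and group entries). It is not the v9fs implementation, and the function names, the text encoding and the sample users are assumptions made for this example.

```python
def parse_acl(text):
    """Parse short-form ACL text into (tag, qualifier, permissions) entries."""
    entries = []
    for item in text.split(","):
        tag, qualifier, perms = item.strip().split(":")
        entries.append((tag, qualifier, frozenset(perms.replace("-", ""))))
    return entries


def acl_allows(entries, owner, owning_group, user, groups, wanted):
    """Simplified POSIX-style ACL check; wanted is a set such as {"r", "w"}."""
    wanted = frozenset(wanted)
    mask = next((p for t, _, p in entries if t == "mask"), None)

    def masked(perms):
        # The mask entry, when present, caps named-user and group permissions.
        return perms if mask is None else perms & mask

    def find(tag, qualifier=""):
        return next((p for t, q, p in entries if t == tag and q == qualifier), None)

    # 1. The file owner uses the owner entry, which the mask does not affect.
    if user == owner:
        return wanted <= (find("user") or frozenset())
    # 2. A named user entry, if one matches, decides the outcome.
    named = find("user", user)
    if named is not None:
        return wanted <= masked(named)
    # 3. Any matching group entry (owning group or named group) may grant access.
    group_perms = [
        masked(p) for t, q, p in entries
        if t == "group" and ((q == "" and owning_group in groups) or q in groups)
    ]
    if group_perms:
        return any(wanted <= p for p in group_perms)
    # 4. Everyone else falls through to the "other" entry.
    return wanted <= (find("other") or frozenset())


# Hypothetical ACL, as it might be decoded from an extended attribute.
acl = parse_acl("user::rw-,user:alice:rw-,group::r--,mask::r--,other::---")
```

In this example the mask limits the named user "alice" to read access even though her own entry lists read and write, which is the kind of per-object decision the client-level access control process makes before a request reaches the storage level.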

The combination of client-level access control and storage-level access control provides a secure sharable file system that is both flexible and scalable. In particular, the file system of the distributed computer system 100 can support numerous sets of clients or tenants. Additionally, the file system can support sets of clients or tenants that use different client-level access control mechanisms, which may be OS dependent. Thus, the file system allows tenant administrators to choose the client-level access control mechanisms for their set or sets of clients.

FIG. 3 illustrates the multi-level access control of the distributed computer system 100. In this figure, a number of file system volumes 300 organized by the distributed file system are shown. As shown in FIG. 3, the clients 1, 2 and 3 in the tenant A and the clients 4 and 5 in the tenant B are trying to access file system objects in the file system volume B. The client 2 of the tenant A and the client 5 of the tenant B did not pass through both the client-level access, which is controlled by the client-level access control mechanism 122 at the respective tenant, and the storage-level access, which is controlled by the storage-level access control mechanism 118 in the distributed file system 106, and thus, are not able to access the desired file object, such as a file directory or a file, in the file system volume B. The clients 1 and 3 of the tenant A and the client 4 of the tenant B have passed through both the client-level access and the storage-level access, and thus, are able to access the desired file object, such as a file directory or a file, in the file system volume B. However, the clients 1 and 3 of the tenant A have full access, while the client 4 of the tenant B has read-only access. These access capabilities of the clients 1 and 3 of the tenant A and the client 4 of the tenant B are controlled by the storage-level access control, as explained above with respect to the storage-level access control mechanism. Thus, using the multi-level access control of the distributed computer system 100, clients of different tenants can access file system objects in the same file system volume. One use of this capability would be for one tenant to allow read-only access to clients external to that tenant for research purposes.

Turning now to FIG. 4, one particular implementation of the distributed computer system 100 is illustrated. In FIG. 4, the data storage devices 110, as well as other IO and metadata servers, of the distributed storage system 108 are not shown. In this implementation, the distributed file system 106 includes a number of distributed file system (DFS) modules 402, each of which includes a 9P server 404, a DFS client 406 and a DFS server 408. In this implementation, the 9P server of each DFS module is embedded with the DFS client of that DFS module. The 9P servers of the DFS modules communicate with 9P clients 410, which are running in each of the clients 104. Each DFS module and the associated clients 104 are running on the same host computer, e.g., the host computer 200. In this implementation, the client-level access control mechanisms 122 are facilitated by the clients 104 of the different tenants 102, as explained above. The storage-level access control mechanism 118 is facilitated by the DFS clients using the storage-level access control database 120, as explained below.

When one of the clients 104 wants to access a file system object, the 9P client 410 of that client sends a storage access request to the 9P server 404 in the same host computer as the client. The storage access request may include a file system operation being requested and an identification of the requesting client. The storage access request is then processed by the DFS client 406. The DFS client may request file mapping information from the DFS server 408 that communicates with the metadata server 114, which handles storage metadata for the target file system volume of the storage access request. The DFS server may reside in a different host computer from the host computer in which the DFS client resides. The DFS client also enforces the storage-level access control by doing a metadata lookup on the storage-level access control database 120. If the requesting client has access to the target file system volume and the appropriate access capability, the storage access request is transmitted from the DFS client to the IO server that handles the target file system volume to execute the storage access request to get results, which are transmitted back to the requesting client.
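The request path just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class names, the dictionary standing in for the storage-level access control database 120, and the callable standing in for an IO server are all hypothetical, and the file mapping lookup through the DFS server 408 is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageAccessRequest:
    client_id: str   # identification of the requesting client, e.g. a VMID
    volume: str      # target file system volume
    path: str        # file system object inside the volume
    operation: str   # "read", "write" or "create"


class AccessDenied(Exception):
    """Raised when the storage-level access control check fails."""


class DfsClient:
    """Hypothetical DFS client that enforces storage-level access control."""

    def __init__(self, access_db, io_servers):
        self.access_db = access_db    # {(client_id, volume): set of allowed operations}
        self.io_servers = io_servers  # {volume: callable(request) -> result}

    def handle(self, request):
        # Metadata lookup on the access control database for this client and volume.
        granted = self.access_db.get((request.client_id, request.volume), frozenset())
        if request.operation not in granted:
            raise AccessDenied(
                f"{request.client_id} may not {request.operation} on {request.volume}"
            )
        # Forward to the IO server that handles the target volume and return its result.
        return self.io_servers[request.volume](request)


def demo_io_server(request):
    """Stand-in IO server that only reports what it was asked to do."""
    return f"{request.operation} {request.volume}:{request.path}"


dfs = DfsClient(
    access_db={
        ("vm-1", "volume-b"): frozenset({"read", "write", "create"}),
        ("vm-4", "volume-b"): frozenset({"read"}),
    },
    io_servers={"volume-b": demo_io_server},
)
```

In this sketch a read request from "vm-4" is forwarded to the stand-in IO server, whereas a write request from the same client raises AccessDenied before any IO server is contacted.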

Various components of the distributed computer system 100, including the IO servers 112 and the metadata servers 114, the DFS modules 402 and the 9P clients 410, may be implemented in any combination of software, hardware and firmware. In some embodiments, at least some of these components are implemented as one or more software programs running on one or more physical computer systems with one or more processors, memory and other computer components commonly found on a personal computer or a physical server.

A storage access operation of the distributed computer system 100 in accordance with an embodiment of the invention is now described with reference to the process flow diagram of FIG. 5. At block 502, a user enters an authentication credential to access a file system volume in the distributed storage system 108 using one of the clients 104. As an example, the user may enter the authentication credential using a computer system connected to a network of one of the tenants 102 that is connected to the client. Next, at block 504, the client-level access control mechanism used at that tenant determines whether the user has access to a file system object in the file system volume. Next, at block 506, the client being used by the user sends a file system request directed to the file system volume via the distributed file system 106. The file system request may include a file system operation being requested and the identification of the requesting client. Next, at block 508, the storage-level access control mechanism at the distributed file system determines whether the client has access to the file system volume using storage-level access control rules, which have been previously defined. In one implementation, this access determination step involves examining a storage-level access control database, which includes client-volume access relationships and an access capability for each client-volume access relationship. Next, at block 510, the distributed file system executes a storage operation defined in the request on the file system volume only if access is granted by both the client-level access control mechanism and the storage-level access control mechanism.
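The sequence of blocks 502 through 510 may be summarized by the following Python sketch, in which the two access decisions are evaluated separately and the operation is executed only when both grant access. The function name, its parameters and the callables passed in for authentication, the client-level decision and execution are assumptions made for illustration; a tenant could supply any client-level mechanism in their place.

```python
def storage_access_operation(user, credential, client_id, volume, obj, operation, *,
                             authenticate, client_level_allows, storage_rules, execute):
    """Hypothetical walk through blocks 502-510 of the process flow."""
    # Block 502: the user presents an authentication credential at the client.
    if not authenticate(user, credential):
        raise PermissionError("authentication failed")

    # Block 504: client-level decision on the file system object in the volume.
    client_ok = client_level_allows(user, volume, obj, operation)

    # Block 506: the request carries the operation and the client identification.
    request = {"client_id": client_id, "volume": volume,
               "object": obj, "operation": operation}

    # Block 508: storage-level decision from previously defined client-volume rules.
    capabilities = storage_rules.get((request["client_id"], request["volume"]), frozenset())
    storage_ok = request["operation"] in capabilities

    # Block 510: execute only if both levels granted access.
    if not (client_ok and storage_ok):
        raise PermissionError("access denied by client-level or storage-level control")
    return execute(request)


# Hypothetical example wiring.
rules = {("vm-4", "volume-b"): frozenset({"read"})}
checks = dict(
    authenticate=lambda user, credential: credential == "secret",
    client_level_allows=lambda user, volume, obj, op: obj.startswith("/public/"),
    storage_rules=rules,
    execute=lambda request: "executed " + request["operation"],
)
```

With this wiring, a read of "/public/data" from "vm-4" is executed, while a read of "/private/data" is refused at the client level and a write to "/public/data" is refused at the storage level.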

A method for accessing a distributed storage system in accordance with an embodiment of the invention is now described with reference to the process flow diagram of FIG. 6. At block 602, storage-level access control rules are defined for a plurality of clients and a plurality of first file system objects of the distributed storage system. At block 604, a file system request is received at a distributed file system from a particular client among the clients to access a second file system object in a particular first file system object among the first file system objects. At block 606, a storage-level access control process is performed, at the distributed file system, to determine whether the particular client has access to the particular first file system object using an identifier of the particular client and the storage-level access control rules. At block 608, access is allowed for the particular client to the second file system object in the particular first file system object only if the particular client has been determined to have access to the particular first file system object according to the storage-level access control rules.

Although the operations of the method(s) herein are shown and described in a particular order, the order of the operations of each method may be altered so that certain operations may be performed in an inverse order or so that certain operations may be performed, at least in part, concurrently with other operations. In another embodiment, instructions or sub-operations of distinct operations may be implemented in an intermittent and/or alternating manner. Also, some of the steps can be repeated multiple times.

It should also be noted that at least some of the operations for the methods may be implemented using software instructions stored on a computer useable storage medium for execution by a computer. As an example, an embodiment of a computer program product includes a computer useable storage medium to store a computer readable program that, when executed on a computer, causes the computer to perform operations, as described herein.

Furthermore, embodiments of at least portions of the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

The computer-useable or computer-readable medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device), or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disc, and an optical disc. Current examples of optical discs include a compact disc with read only memory (CD-ROM), a compact disc with read/write (CD-R/W), a digital video disc (DVD), and a Blu-ray disc.

In the above description, specific details of various embodiments are provided. However, some embodiments may be practiced with less than all of these specific details. In other instances, certain methods, procedures, components, structures, and/or functions are described in no more detail than to enable the various embodiments of the invention, for the sake of brevity and clarity.

The components of the embodiments as generally described in this document and illustrated in the appended figures could be arranged and designed in a wide variety of different configurations. Thus, the detailed description of various embodiments, as represented in the figures, is not intended to limit the scope of the present disclosure, but is merely representative of various embodiments. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by this detailed description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present invention should be or are in any single embodiment of the invention. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, discussions of the features and advantages, and similar language, throughout this specification may, but do not necessarily, refer to the same embodiment.

Furthermore, the described features, advantages, and characteristics of the invention may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, in light of the description herein, that the invention can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the invention.

Reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the indicated embodiment is included in at least one embodiment of the present invention. Thus, the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.

Although specific embodiments of the invention have been described and illustrated, the invention is not to be limited to the specific forms or arrangements of parts so described and illustrated. The scope of the invention is to be defined by the claims appended hereto and their equivalents.

What is claimed is:
1. A method for accessing a distributed storage system, the method comprising: defining storage-level access control rules for a plurality of clients and a plurality of first file system objects of the distributed storage system; receiving a file system request from a particular client among the clients to access a second file system object in a particular first file system object among the first file system objects at a distributed file system; performing a storage-level access control process, at the distributed file system, to determine whether the particular client has access to the particular first file system object using an identifier of the particular client and the storage-level access control rules; and allowing the particular client access to the second file system object in the particular first file system only if the particular client has been determined to have access to the particular first file system object according to the storage-level access control rules.
2. The method of claim 1, further comprising performing a client-level access control process to determine whether a user has access to the second file system object in the particular first file system object.
3. The method of claim 2, wherein the client-level access control process uses an access control list scheme.
4. The method of claim 1, wherein performing the storage-level access control process includes determining an access capability for an access relationship between the particular client and the particular first file system object using the storage-level access control rules.
5. The method of claim 1, wherein the particular first file system object is a file system volume and the second file system object is a file directory or a file.
6. The method of claim 1, wherein defining storage-level access control rules includes defining sets of clients and sets of first file system objects and defining which of the sets of clients have access to which of the sets of first file system.
7. The method of claim 6, wherein defining storage-level access control rules includes defining an access capability for each set of clients that has access to at least one of the sets of first file system objects.
8. A computer-readable storage medium containing program instructions for a method for accessing a distributed storage system, wherein execution of the program instructions by one or more processors of a computer system causes the one or more processors to perform steps comprising: defining storage-level access control rules for a plurality of clients and a plurality of first file system objects of the distributed storage system; receiving a file system request from a particular client among the clients to access a second file system object in a particular first file system object among the first file system objects at a distributed file system; performing a storage-level access control process, at the distributed file system, to determine whether the particular client has access to the particular first file system object using an identifier of the particular client and the storage-level access control rules; and allowing the particular client access to the second file system object in the particular first file system only if the particular client has been determined to have access to the particular first file system object according to the storage-level access control rules.
9. The computer-readable storage medium of claim 8, wherein the steps further comprises performing a client-level access control process to determine whether a user has access to the second file system object in the particular first file system object.
10. The computer-readable storage medium of claim 9, wherein the client-level access control process uses an access control list scheme.
11. The computer-readable storage medium of claim 8, wherein performing the storage-level access control process includes determining an access capability for an access relationship between the particular client and the particular first file system object using the storage-level access control rules.
12. The computer-readable storage medium of claim 8, wherein the particular first file system object is a file system volume and the second file system object is a file directory or a file.
13. The computer-readable storage medium of claim 8, wherein defining storage-level access control rules includes defining sets of clients and sets of first file system objects and defining which of the sets of clients have access to which of the sets of first file system.
14. The computer-readable storage medium of claim 13, wherein defining storage-level access control rules includes defining an access capability for each set of clients that has access to at least one of the sets of first file system objects.
15. A distributed computer system comprising: a plurality of clients running on host computers; a distributed storage system that includes a plurality of data storage devices; and a distributed file system that interfaces with the distributed storage system to provide a file system for the clients, the distributed file system that: receives a file system request from a particular client among the clients to access a second file system object in a particular first file system object among the first file system objects at a distributed file system; performs a storage-level access control process to determine whether the particular client has access to the particular first file system object using an identifier of the particular client and storage-level access control rules; and allows the particular client access to the second file system object in the particular first file system only if the particular client has been determined to have access to the particular first file system object according to the storage-level access control rules.
16. The system of claim 15, a set of the clients with the particular client includes a client-level access control mechanism that determines whether a user has access to the second file system object in the particular first file system object.
17. The system of claim 16, wherein the client-level access control mechanism uses an access control list scheme.
18. The system of claim 15, wherein the distributed file system determines an access capability for an access relationship between the particular client and the particular first file system object using the storage-level access control rules.
19. The system of claim 15, wherein the particular first file system object is a file system volume and the second file system object is a file directory or a file.
20. The system of claim 15, wherein the storage-level access control rules are defined using sets of clients, sets of first file system objects, information regarding which of the sets of clients have access to which of the sets of first file system, and information regarding an access capability for each set of clients that has access to at least one of the sets of first file system objects.